Mining MOOC Lecture Transcripts to Construct Concept 
Dependency Graphs 


Fareedah ALSaad 
University of Illinois at 
Urbana-Champaign 
Urbana-Champaign, USA 


alsaad2@illinois.edu 


Hari Sundaram 
University of Illinois at 
Urbana-Champaign 
Urbana-Champaign, USA 
hs1@illinois.edu 


ABSTRACT 


This paper addresses the question of identifying a concept depen- 
dency graph for a MOOC through unsupervised analysis of lecture 
transcripts. The problem is important: extracting a concept graph 
is the first step in helping students with varying preparation to un- 
derstand course material. The problem is challenging: instructors 
are unaware of the student preparation diversity and may be unable 
to identify the right resolution of the concepts, necessitating costly 
updates; inferring concepts from groups suffers from polysemy; the 
temporal order of concepts depends on the concepts in question. 
We propose innovative unsupervised methods to discover a directed 
concept dependency within and between lectures. Our main tech- 
nical innovation lies in exploiting the temporal ordering amongst 
concepts to discover the graph. We propose two measures—the 
Bridge Ensemble Measure and the Global Direction Measure—to 
infer the existence and the direction of the dependency relations 
between concepts. The bridge ensemble measure identifies concept
overlap between lectures, concept co-occurrence within short
windows, and the lecture where each concept first occurs. The global
direction measure incorporates time directly by analyzing the con- 
cept time ordering both globally and within lectures. Experiments 
over real-world MOOC data show that our method outperforms the 
baseline in both AUC and precision/recall curves. 


1. INTRODUCTION 


This paper presents two methods to identify extant concept relation- 
ships in lectures from a Massive Open Online Course (MOOC). 


Discovering concept relationships within MOOCs helps courses adapt
to learner diversity, since students from all over the globe take
MOOC classes. Developing a fine-grained map of the concepts
presented in the MOOC, indicating prerequisite relationships, lets
students browse course materials flexibly. In addition, such a map
can emphasize the important topics in the course and how they are
related, which can improve students' understanding. It can be
further used to represent the knowledge
state of a student at the concept level, and thus enable personalization 
in recommending course materials or quiz questions to students. In 
this paper, our goal is to construct such a map automatically for any 
course in order to accommodate students’ diversity by supporting 
personalized learning. 


Generating such a concept dependency graph presents a number of 
challenges. First, the instructor cannot predict the prior preparation 
of the students taking the class or the granularity at which she 
should develop the concept graph, and ensuring that such a concept 
graph remains up to date every year is time consuming. Second, an
instructor does not introduce concepts in a rigid order in which she
always presents the prerequisite concept before introducing the
main concept, which makes it difficult to determine the presence
and the direction of a relationship between concepts.


We propose innovative unsupervised methods to discover a directed 
concept dependency graph. We use lecture transcripts, as do Chaplot 
and Koedinger [2], to model the dependency structure between 
course concepts. Where Chaplot and Koedinger focus on modeling 
the prerequisite structure between units or lectures, we instead focus 
on inferring the dependency structure among concepts that appear 
within and between lectures. Our main technical innovation lies 
in exploiting the temporal ordering amongst concepts to discover 
the graph. To the best of our knowledge, we are the first to use 
temporal features to construct the dependency graph. We propose two 
measures—the Bridge Ensemble Measure and the Global Direction 
Measure—to infer the existence and the direction of the dependency 
relations between concepts. Both proposed measures outperform the
baseline method [2] in AUC and the precision/recall curves.


The rest of the paper is organized as follows. In Section 2, we formally
frame our problem before describing the two proposed measures in
Section 3. Section 4 elaborates our approach for the evaluation and
Section 5 presents some limitations. Finally, we discuss related
work in Section 6 before concluding in Section 7.


2. PROBLEM DEFINITION 


Informally, the problem explored in this work can be stated as follows: 
given course data, predict the dependency relationships between the 
course concepts. More formally, let $X$ be the course represented
by an ordered list of transcripts, one per lecture:
$X = [T_1, T_2, \ldots, T_M]$, where $M$ is the total number of lectures.
Let $C_X$ be the set of concepts discussed in the course,
$C_X = \{c_1, c_2, \ldots, c_N\}$, where $N$ is the total number of unique
concepts. Given $X$ and $C_X$, we aim to generate the concept
dependency graph that relates concepts in $C_X$ according to their
prerequisite relationships. The resulting concept dependency graph
is described by an edge weight matrix $A \in \mathbb{R}^{N \times N}$.
Each entry $a_{i,j}$ of matrix $A$ contains the edge weight for the
associated relationship $c_i \to c_j$, which means concept $c_i$
is a prerequisite for concept $c_j$. The edge weight reflects the level
of confidence in the inferred relationship. Notice that since the
prerequisite relationship has a direction, $A$ is not symmetric.


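$$
A = \begin{bmatrix}
0 & W(c_1 \to c_2) & \cdots & W(c_1 \to c_N) \\
W(c_2 \to c_1) & 0 & \cdots & W(c_2 \to c_N) \\
\vdots & \vdots & \ddots & \vdots \\
W(c_N \to c_1) & W(c_N \to c_2) & \cdots & 0
\end{bmatrix}
$$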


The problem of constructing the concept dependency graph can be 
reduced to the problem of computing the edge weight between pairs 
of concepts given course data. 


3. LINKING COURSE CONCEPTS 


To relate the course concepts according to their dependency relation- 
ships, we propose two measures: the Bridge Ensemble Measure and 
the Global Direction Measure. 


3.1 Bridge Ensemble Measure 

The Bridge Ensemble Measure (BEM) captures concept dependency 
structure utilizing inter-lecture and intra-lecture strategies. It contains 
three components: Bridges, Sliding Windows, and the First Lecture 
Indicator. 


3.1.1 Bridges 


Let us look at how instructors naturally introduce concepts and their
prerequisite(s). Let $C_X$ be the set of concepts presented in course $X$
and let $c_a$ and $c_b$ be concepts in that set. Determining the presence of
a concept $c_a$ in a lecture transcript $T_i$ is discussed further in Section
4.1. Suppose that $c_a$ is a prerequisite to $c_b$. Then it stands to reason
that (1) $c_a$ will be introduced before $c_b$ in the course progression,
and (2) while explaining or talking about $c_b$, the instructor will
naturally refer to $c_a$.


Bridge concepts allow us to exploit the temporal nature of lectures 
to infer concept dependency relationships across lectures. Intuitively, 
bridge concepts are introduced in an earlier lecture but re-appear in a 
later lecture when some new concept(s) are introduced. Accordingly, 
bridge concepts signal a prerequisite relationship from the bridge 
concepts to the new concepts introduced in the later lecture. For 
example, in Figure 1, the bridge concepts $c_3$ and $c_4$ are more likely
to be prerequisites to concepts $c_5$, $c_6$, and $c_7$ discussed in lecture $L_2$.


Figure 1: The bridging concepts ($c_3$ and $c_4$) between
lectures $L_1$ and $L_2$ and the resulting candidate prerequisite
relationships.


Formally, let $L_i$ be the set of concepts in lecture $i$ in course $X$,
and $L_j$ be the set of concepts in lecture $j$, where $j > i$. The
intersection $L_i \cap L_j$ contains all the concepts that appear in both
lectures. We call these bridge concepts. The difference $L_j \setminus L_i$
contains the difference concepts, which are the concepts present in the
later lecture $j$ but not in the earlier lecture $i$. If $c_a$ belongs to the
bridge concepts and $c_b$ belongs to the difference concepts, then
there is evidence for the dependency relationship $c_a \to c_b$ and the
edge weight $W(c_a \to c_b)$ should increase. As a result, the bridge
set $B_{ji} = \{(c_a \to c_b) \mid c_a \in L_i \cap L_j \wedge c_b \in L_j \setminus L_i\}$
contains all candidate prerequisite edges from lecture $L_i$ to lecture
$L_j$. If we repeat this exercise for every possible pair of lectures, we
end up with a comprehensive set of all candidate bridge edges,
Bridges, for the course:


$$
\text{Bridges} = B_{M(M-1)} \cup B_{M(M-2)} \cup \cdots \cup B_{21} \tag{1}
$$


To calculate the edge weight of candidate edges in Bridges, we use
the following bridge scoring function:

$$
W(c_a \to c_b) \propto F_{\text{Bridges}}(c_a \to c_b) \tag{2}
$$

where

$$
F_{\text{Bridges}}(c_a \to c_b)
= \frac{\text{number of lectures where we observe both } c_a \text{ and } c_b}
       {\text{number of lectures where we observe } c_b}
= \frac{|\{L_j \mid c_a, c_b \in L_j\}|}{|\{L_j \mid c_b \in L_j\}|} \tag{3}
$$

Keep in mind that the bridge scoring function is only calculated
for candidate edges that belong to Bridges. Other pairs of concepts
have a bridging score of zero.
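
The bridge construction and the scoring function of Eq. (3) can be sketched as follows. This is a minimal illustration, assuming each lecture has already been reduced to a set of concept names; the function names `bridge_edges`, `f_bridges`, and `bridge_scores` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def bridge_edges(lectures):
    """Collect candidate edges (a, b) over all lecture pairs.

    `lectures` is a list of concept sets in course order. For an earlier
    lecture i and a later lecture j, a is a bridge concept (in both) and
    b is a difference concept (only in the later lecture).
    """
    edges = set()
    for i, j in combinations(range(len(lectures)), 2):  # i < j
        bridge = lectures[i] & lectures[j]
        difference = lectures[j] - lectures[i]
        edges.update((a, b) for a in bridge for b in difference)
    return edges


def f_bridges(lectures, a, b):
    """Share of the lectures containing b that also contain a (Eq. 3)."""
    with_b = [lec for lec in lectures if b in lec]
    if not with_b:
        return 0.0
    return sum(1 for lec in with_b if a in lec) / len(with_b)


def bridge_scores(lectures):
    """Score only the candidate edges; all other pairs implicitly score 0."""
    return {(a, b): f_bridges(lectures, a, b) for a, b in bridge_edges(lectures)}


# Two lectures shaped like the Figure 1 example.
lectures = [{"c1", "c2", "c3", "c4"}, {"c3", "c4", "c5", "c6", "c7"}]
scores = bridge_scores(lectures)
```

In this toy course the bridge concepts c3 and c4 each receive a candidate edge to c5, c6, and c7, and no edge points back from a difference concept to a bridge concept.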


3.1.2 Sliding Windows 

Bridge edges determined by the bridge method do not capture every
possible prerequisite relationship. Consider the case where concept
$c_b$ has a strong prerequisite $c_a$, but $c_a$ and $c_b$ only appear together
either in the set of bridge concepts ($L_i \cap L_j$) or in the set of difference
concepts ($L_j \setminus L_i$). As a result, $c_a \to c_b$ will never appear in Bridges
and hence the bridge method cannot infer the prerequisite relation
between them.


To solve this problem and capture intra-lecture prerequisite
relationships, we zoom into each lecture and consider the proximity of
concepts being presented in the lecture. Let $L_j = [c_1, c_2, \ldots, c_n]$ be
an ordered list of concepts discussed in lecture $j$, where $n$ is the total
number of concepts. Keep in mind that this ordered list contains
repeated concepts, which appear in the order in which the instructor
mentioned them. In the sliding windows method, we segment $L_j$
into windows $W_i = [c_i, \ldots, c_{i+K-1}]$ as follows:

$$
\text{Windows}_j =
\begin{cases}
\{W_i \mid 1 \le i \le n - K + 1\}, & n > K \\
\{L_j\}, & n \le K
\end{cases} \tag{4}
$$

Figure 2 depicts the representation of lecture $L_j$ using $r = n - K + 1$
windows of size $K = 4$. In this study, we choose the $K$ that gives the
best performance; $K$ is set to 10 concepts.


Figure 2: A visualization example of lecture $L_j$ with
$r = n - K + 1$ sliding windows of size $K = 4$. The sliding
windows capture the proximity of concepts.
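
The segmentation step can be sketched in a few lines. This is a minimal illustration, assuming that a lecture with at most K concept mentions is treated as a single window; the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(concepts, k):
    """Split an ordered list of concept mentions into overlapping windows.

    Produces n - k + 1 windows of size k, shifted by one mention each.
    A lecture with at most k mentions yields one window holding all of
    them (an assumption for short lectures); an empty lecture yields none.
    """
    n = len(concepts)
    if n == 0:
        return []
    if n <= k:
        return [list(concepts)]
    return [list(concepts[i:i + k]) for i in range(n - k + 1)]


# Six mentions with K = 4 give 6 - 4 + 1 = 3 windows.
lecture = ["c1", "c2", "c3", "c2", "c4", "c5"]
windows = sliding_windows(lecture, 4)
```

Repeated mentions are kept on purpose, since the ordered list records every time the instructor refers to a concept.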


The more windows in which $c_a$ and $c_b$ appear together, the stronger
the relationship between $c_a$ and $c_b$ is; thus the edge weight should
increase. The second component of the BEM for edge weights is
the probability of the edge $c_a \to c_b$ given the information we have
about all windows in all lectures, $\text{Windows} = \bigcup_j \text{Windows}_j$:

$$
W(c_a \to c_b) \propto F_{\text{Bridges}}(c_a \to c_b) + F_{\text{Windows}}(c_a \to c_b) \tag{5}
$$

where

$$
F_{\text{Windows}}(c_a \to c_b)
= \frac{\text{number of windows where we observe } c_a \text{ and } c_b \text{ together}}
       {\text{number of windows where we observe } c_b}
= \frac{|\{W_i \in \text{Windows} \mid c_a, c_b \in W_i\}|}
       {|\{W_i \in \text{Windows} \mid c_b \in W_i\}|} \tag{6}
$$


We choose to add the bridge weight to the sliding windows
weight because these methods complement each other. Some edges
that are captured by the sliding windows method have a bridging
score of zero and vice versa. Multiplying these two components instead of
adding them would eliminate their effect in capturing inter-
and intra-lecture prerequisite edges, as the value of these edges would
be zero.
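
The window score and the reason for adding rather than multiplying can be illustrated with a short sketch. The function name `f_windows` and the numeric bridge score below are hypothetical values chosen for illustration only.

```python
def f_windows(windows, a, b):
    """Share of the windows containing b that also contain a (undirected)."""
    with_b = [w for w in windows if b in w]
    if not with_b:
        return 0.0
    return sum(1 for w in with_b if a in w) / len(with_b)


windows = [["a", "b", "c"], ["b", "c", "d"], ["c", "d", "b"]]
window_score = f_windows(windows, "a", "b")  # a shares 1 of the 3 windows with b

# Hypothetical edge found only by the bridge method: its window score is 0.
bridge_only = {"bridge": 0.5, "window": 0.0}
summed = bridge_only["bridge"] + bridge_only["window"]   # evidence is kept
product = bridge_only["bridge"] * bridge_only["window"]  # evidence is lost
```

With a sum, an edge supported by only one component keeps its weight; with a product, it would be zeroed out.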


3.1.3 First Lecture Indicator 

The third component of the BEM for edge weights comes from the
intuition that the context (other observed concepts) in which a new
concept $c_b$ is first introduced plays a strong role in determining what
the prerequisite concepts of $c_b$ are. We assume that $c_b$ is first
introduced in the lecture $j$ in which $c_b$ has its highest term
frequency compared to other lectures. We call $j$ the lecture indicator
of $c_b$ and denote it by $LI(c_b)$. When concept $c_a$ appears in the lecture
indicator of $c_b$ ($c_a \in L_{LI(c_b)}$), then $c_a$ might be a prerequisite to
$c_b$. Another condition we need to examine is the temporal order of
the lecture indicator of concept $c_a$. Naturally, when the instructor
discusses a new concept, he or she needs to explain its prerequisite
concepts beforehand, either in earlier lectures or in the same lecture
where the new concept is being introduced. More formally, then,
$LI(c_a) \le LI(c_b)$. Thus when calculating $W(c_a \to c_b)$ we consider
the first lecture indicator variable $FLI_{c_a, c_b}$, where:

$$
FLI_{c_a, c_b} =
\begin{cases}
1, & \text{if } c_a \in L_{LI(c_b)} \text{ and } LI(c_a) \le LI(c_b) \\
0, & \text{otherwise}
\end{cases}
$$


Figure 3: A visual explanation of the Global Direction
Indicator. $X$ represents the course. The matrix
contains the Global Direction Indicator (before
normalization). Each element in the matrix represents how
many times the concept $c_{row}$ appears before the concept
$c_{column}$ in the entire course.


The BEM for edge weights now becomes:

$$
W(c_a \to c_b) \propto F_{\text{Bridges}}(c_a \to c_b)
+ F_{\text{Windows}}(c_a \to c_b) + FLI_{c_a, c_b} \tag{7}
$$
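
The first lecture indicator can be sketched as below. This is a minimal illustration under two assumptions that the text does not settle: ties in term frequency are resolved toward the earliest lecture, and a prerequisite may be introduced in the same lecture as the new concept (so the lecture indicators are compared with "less than or equal"). The names `lecture_indicator` and `fli` are hypothetical.

```python
def lecture_indicator(lecture_lists, concept):
    """Index of the lecture where `concept` has its highest term frequency.

    `lecture_lists` holds one ordered list of concept mentions per lecture.
    Returns None if the concept never occurs. Ties go to the earliest
    lecture (an assumption).
    """
    counts = [mentions.count(concept) for mentions in lecture_lists]
    best = max(counts, default=0)
    return counts.index(best) if best > 0 else None


def fli(lecture_lists, a, b):
    """1 if a occurs in b's indicator lecture and a is introduced no later than b."""
    li_a = lecture_indicator(lecture_lists, a)
    li_b = lecture_indicator(lecture_lists, b)
    if li_a is None or li_b is None:
        return 0
    return int(a in lecture_lists[li_b] and li_a <= li_b)


lecture_lists = [
    ["tf", "tf", "idf"],
    ["bm25", "tf", "bm25", "idf"],
]
forward = fli(lecture_lists, "tf", "bm25")
backward = fli(lecture_lists, "bm25", "tf")
```

Here "tf" is introduced in the first lecture and reappears where "bm25" is introduced, so only the edge from "tf" to "bm25" receives the indicator.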


3.2 Global Direction Measure 


The Global Direction Measure (GDM) is an alternative measure 
we propose to capture the dependency relationships between course 
concepts by incorporating time directly to consider not only the time 
ordering within lectures but also globally throughout the course 
delivery. In the Bridge Ensemble Measure, one problem with the 
sliding windows method is that the temporal order of concepts within 
a window $W_i$ is ignored. This seems reasonable since in a single
window, the instructor might mention the dependent concept before 
the prerequisite concepts. However, utilizing the temporal order of 
concepts in the entire course might improve the inference of the 
direction of the dependency relation. Thus, we propose the idea of 
the Global Direction Indicator (GDI). 


The global direction indicator keeps track of the global temporal
order frequency of concepts discussed in the course. In other words,
it captures how many times concept $c_a$ appears before concept $c_b$ in
the entire course. The more often concept $c_a$ appears before
concept $c_b$, the more likely it is that the direction of the prerequisite
relation is from $c_a$ to $c_b$ ($c_a \to c_b$). To capture the global direction
indicator, we represent the course $X$ as an ordered list of concepts
discussed in all course lectures:
$X = [c_{11}, c_{12}, \ldots, c_{ij}, \ldots, c_{M1}, c_{M2}, \ldots]$,
where $i$ is the lecture number, $j$ is the concept number, and $M$ is
the total number of lectures. Then, we keep track of the temporal order
frequency between any pair of concepts in the entire course.
Figure 3 depicts the idea of the global direction indicator.


The formula of the global direction indicator is as follows:

$$
GDI(c_a, c_b) = \frac{TOF(c_a \to c_b)}{\sum_{c_i \in C_X} TOF(c_a \to c_i)} \tag{8}
$$

where TOF is the temporal order frequency and the $c_i$ are all concepts
that appear after $c_a$ in the course progression. We normalize the TOF of
$c_a \to c_b$ by the total number of times $c_a$ appears before any other
concept in the course to reduce the impact of popular concepts that
tend to appear before almost every other concept in the course.
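
A small sketch of the global direction indicator follows. It assumes one particular reading of "how many times a concept appears before another": every pair of mentions in which the first concept occurs earlier than the second is counted once. The names `temporal_order_frequency` and `gdi` are hypothetical.

```python
from collections import Counter, defaultdict


def temporal_order_frequency(course):
    """Count, for each ordered pair (a, b), the mention pairs with a before b.

    `course` is the ordered list of concept mentions across all lectures.
    """
    seen = Counter()        # mentions of each concept so far
    tof = defaultdict(int)  # (earlier, later) -> count
    for concept in course:
        for earlier, count in seen.items():
            if earlier != concept:
                tof[(earlier, concept)] += count
        seen[concept] += 1
    return tof


def gdi(tof, a, b):
    """TOF(a -> b) normalized by the total TOF from a to any concept (Eq. 8)."""
    total = sum(value for (src, _), value in tof.items() if src == a)
    return tof.get((a, b), 0) / total if total else 0.0


course = ["c1", "c2", "c1", "c3"]
tof = temporal_order_frequency(course)
```

For this toy course, c1 precedes c2 once and c3 twice, so the normalized scores from c1 are 1/3 and 2/3; c3 precedes nothing and scores zero toward every concept.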


In addition to the global direction indicator, we modify the sliding
windows method to consider the local temporal order of concepts
within a single window:

$$
F_{\text{Dir-Windows}}(c_a \to c_b)
= \frac{\text{number of windows where we observe } c_a \to c_b}
       {\text{number of windows where we observe } c_b}
= \frac{|\{W_i \in \text{Windows} \mid (c_a \to c_b) \in W_i\}|}
       {|\{W_i \in \text{Windows} \mid c_b \in W_i\}|} \tag{9}
$$


In this case, the directed sliding windows (Dir-Windows) method
captures not only the proximity of pairs of concepts but also the local
direction within lectures, while the global direction indicator captures
the frequency of the global direction.
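
The directed window score can be sketched as follows. The text does not spell out when a window counts as showing one concept before another; this sketch assumes it does whenever the first mention of the prerequisite candidate precedes the last mention of the other concept. The names `precedes` and `f_dir_windows` are hypothetical.

```python
def precedes(window, a, b):
    """True if the first mention of a comes before the last mention of b."""
    if a not in window or b not in window:
        return False
    first_a = window.index(a)
    last_b = len(window) - 1 - window[::-1].index(b)
    return first_a < last_b


def f_dir_windows(windows, a, b):
    """Share of the windows containing b in which a precedes b."""
    with_b = [w for w in windows if b in w]
    if not with_b:
        return 0.0
    return sum(1 for w in with_b if precedes(w, a, b)) / len(with_b)


windows = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "d"]]
a_to_b = f_dir_windows(windows, "a", "b")
b_to_a = f_dir_windows(windows, "b", "a")
```

Unlike the undirected score, the two directions of the same pair now generally differ.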


The edge weight function according to the GDM is as follows:

$$
W(c_a \to c_b) \propto GDI(c_a, c_b) \times F_{\text{Dir-Windows}}(c_a \to c_b) \tag{10}
$$


The rationale behind combining the GDM components by multiplying
them instead of adding them is to use the global direction
indicator to improve the direction of edges predicted by the directed 
windows instead of predicting the existence of edges. The problem 
with the global direction indicator in predicting the edge existence 
is that it might give high weight to concepts that appear very often 
with the same direction order even if they do not appear together in 
any lecture. 
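
The effect of the product can be seen in a tiny sketch. The scores below are hypothetical values chosen for illustration, and `gdm_weight` is a hypothetical name.

```python
def gdm_weight(gdi_score, dir_windows_score):
    """Edge weight of Eq. (10): the product of the two GDM components."""
    return gdi_score * dir_windows_score


# A pair with a consistent global order that never shares a window:
never_together = gdm_weight(gdi_score=0.9, dir_windows_score=0.0)

# A pair that shares windows and has a fairly consistent global order:
often_together = gdm_weight(gdi_score=0.6, dir_windows_score=0.5)
```

The first pair receives zero weight despite its high global direction score, which is the behavior the multiplication is meant to enforce.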


4. EVALUATION 


In this section, we demonstrate the evaluation process conducted 
to assess the performance of the proposed measures. We utilize the 
course “Text Retrieval and Search Engines”¹ to construct the concept 
dependency graph to evaluate our developed measures. 


4.1 Building the Course Concept Space 

The focus of our work is on understanding how to infer the dependency 
relationship between concepts, but in order to evaluate the proposed 
measures, we must first construct a set of concepts. There is a wide 
body of work which attempts to solve the problem of defining and 
inferring concepts [3, 9, 10]. In this paper, we use a pre-trained
part-of-speech-guided phrasal segmentation method, called Autophrase [10, 8], to
extract salient phrases from lecture transcripts. While Autophrase
generates many good salient phrases, some phrases are either too 
general or are verb phrases. Our approach to improve the quality 
of the selected phrases is to extract phrases from weekly overviews 
using the same phrasal segmentation method. At the beginning of 
each week in the course, there is a week overview page that explains 
the goals and objectives of that week along with the key phrases and 
concepts that students need to understand. Utilizing the overview 
page of each week aids in filtering out meaningless phrases. 


After extracting salient phrases, we manually group synonymous phrases
together to construct a concept. We follow Siddiqui et al.'s [11]
definition by defining a concept as a set of salient phrases
that describe it. This design decision was made to allow for flexibility 
in concept description since the same concept can be referred to 
using different phrases by different people. 
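
Once concepts are defined as sets of phrases, each transcript has to be turned into the ordered list of concept mentions that the measures in Section 3 consume. The paper does not describe this step, so the sketch below is only one plausible approach, assuming case-insensitive whole-phrase matching; the function name `ordered_concepts` and the example phrases are hypothetical.

```python
import re


def ordered_concepts(transcript, concepts):
    """Return concept names in the order their phrases occur in the transcript.

    `concepts` maps a concept name to the set of phrases that describe it.
    Every phrase occurrence yields one mention of the owning concept.
    """
    text = transcript.lower()
    hits = []
    for name, phrases in concepts.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            hits.extend((m.start(), name) for m in re.finditer(pattern, text))
    return [name for _, name in sorted(hits)]


concepts = {
    "tf-idf": {"tf-idf", "tfidf weighting"},
    "vector space model": {"vector space model", "vsm"},
}
transcript = "The vector space model uses TF-IDF. In VSM, documents are vectors."
mentions = ordered_concepts(transcript, concepts)
```

Both "vector space model" and "VSM" map to the same concept, which reflects the flexibility that the set-of-phrases definition is meant to provide.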


4.2 Ground Truth 


To evaluate the effectiveness of the proposed measures, we form
a ground truth concept graph by leveraging student submissions
about concept dependencies in a course (CS 410) at UIUC offered in
the Spring 2017 semester that follows Coursera’s “Text Retrieval and
Search Engines.” Students were asked to submit a weekly summary
of the new concepts they had learned along with their prerequisite
concepts.


Figure 4: The visualization of the ground truth graph
generated from student data.


¹ https://www.coursera.org/Learn/text-retrieval


The following is an example of a student entry from week 3:

# f-measure: precision, recall 
# pr curve: precision, recall 
# map: arithmetic mean of average precision 
# gmap: geometric mean of average precision 


The ground truth contains 239 edges among the 74 concepts in the
concept space. Figure 4 visualizes the ground truth concept graph to
show how concepts are related. Concepts such as “information
retrieval”, “search engines”, “ranking function”, and “evaluation
methodology” have a higher degree because they are connected with
many other concepts in the course. This is reasonable, as these
concepts are considered fundamental in this course. Such a figure can
also serve as a useful topic map that lets students browse course
materials covering different topics flexibly; however, the map shown
in this figure was constructed from student submissions, whereas with
the proposed methods we can construct such a map automatically for
any course.


4.3 Baseline Approach 

Since the problem formulation of using only transcripts to predict 
concept dependency is novel, strictly speaking, no previous method 
can be directly used to produce the desired output. The closest work 
that we can compare with is the work of Chaplot and Koedinger [2], 
which also only uses course content without any external knowledge. 
In their paper, they develop two methods: a text-based method called 
the overlap method, and a performance-based method. Since our 
work is a text-based method, we compare our measures to the overlap 
method. The main difference between our work and the overlap 
method is that we exploit the temporal features of course delivery 
while the overlap method does not; this makes the overlap method 
an ideal baseline to study the effect of the temporal features on the 
accuracy of edge prediction. 


The overlap method, however, only predicts the prerequisite relations
between units (e.g., lectures) using the text overlap between units.
Thus the method cannot be used directly to predict dependencies
between concepts, the problem that we attempt to solve. Therefore,
we propose an extension called ExtendedOverlap as a baseline for
comparison. Our main idea for extending the overlap method is to
first map each concept to the set of lectures where the concept
occurs and then leverage the lecture dependency relations predicted
by the overlap method to assess the dependency between two concepts
by accumulating the weights of the dependency relations of the
lectures they belong to. All weights are normalized to be between
zero and one. We implemented the overlap method using noun
phrases with document frequency normalization since this
configuration achieves the highest performance [2].


Table 1: Performance (area under ROC curve) of concept
graph generation for the three methods considered. Both
of the new measures introduced in the paper outperform
the state-of-the-art ExtendedOverlap method on both edge
existence and edge direction tasks.

                              AUC (ROC)
Method                        Existence    Direction
Bridge Ensemble Measure       0.80         0.81
Global Direction Measure      0.80         0.78
ExtendedOverlap Method        0.74         0.74


Figure 5: The precision/recall curves of the Bridge Ensemble
Measure (BEM), the Global Direction Measure (GDM), and
the baseline ExtendedOverlap Method (EOM). GDM and
BEM outperform the baseline method (EOM) in both the
existence evaluation and the direction evaluation.
Panels: (a) Existence Evaluation, (b) Direction Evaluation.
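
The accumulation step of the ExtendedOverlap baseline can be sketched as follows. The lecture-level weights are taken as given (a hypothetical stand-in for the output of the overlap method of [2], which is not reproduced here), and dividing by the maximum is only one assumed way to normalize weights to the range from zero to one. The name `extended_overlap` is hypothetical.

```python
def extended_overlap(lecture_dep, concept_lectures):
    """Accumulate lecture-level dependency weights into concept-level weights.

    lecture_dep: dict mapping (i, j) to the weight of "lecture i is a
        prerequisite of lecture j" (assumed to come from the overlap method).
    concept_lectures: dict mapping a concept to the set of lectures it occurs in.
    """
    raw = {}
    for a, lectures_a in concept_lectures.items():
        for b, lectures_b in concept_lectures.items():
            if a == b:
                continue
            raw[(a, b)] = sum(
                lecture_dep.get((i, j), 0.0) for i in lectures_a for j in lectures_b
            )
    top = max(raw.values(), default=0.0)
    # Max-normalization is an assumption; the paper only states the range.
    return {edge: (w / top if top else 0.0) for edge, w in raw.items()}


lecture_dep = {(0, 1): 0.8, (1, 2): 0.4}
concept_lectures = {"a": {0}, "b": {1}, "c": {1, 2}}
weights = extended_overlap(lecture_dep, concept_lectures)
```

Because the lecture-level weights carry no information about the order of concepts inside a lecture, this baseline isolates the contribution of the temporal features used by the proposed measures.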


4.4 Concept Graph Performance 

We evaluate the performance of the generated concept
graphs over two dimensions: edge existence and edge direction.
Edge existence evaluates whether the method predicts correct edges
or not, while edge direction evaluation checks not only the
correctness of the predicted edges but also their direction. The AUC values
of all the methods are shown in Table 1. Both the
Bridge Ensemble Measure (BEM) and the Global Direction Measure
(GDM) outperform the baseline ExtendedOverlap (EOM) in terms
of the AUC values for both the existence task and the direction task.
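
The two evaluation dimensions can be illustrated with a small sketch. The exact labeling protocol is not given in the text; the sketch assumes that, for existence, a candidate edge counts as correct if the ground truth contains it in either direction, and for direction only if it contains it in the predicted direction. AUC is computed as the probability that a correct edge is scored above an incorrect one, with ties counted as one half. The names `edge_labels` and `roc_auc` are hypothetical.

```python
def edge_labels(candidates, truth, directed):
    """Label each candidate edge as correct or not against the ground truth."""
    if directed:
        return [edge in truth for edge in candidates]
    return [edge in truth or edge[::-1] in truth for edge in candidates]


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative edge")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


truth = {("a", "b")}
candidates = [("a", "b"), ("b", "a"), ("a", "c")]
scores = [0.5, 0.7, 0.2]  # hypothetical edge weights from some measure

existence_auc = roc_auc(scores, edge_labels(candidates, truth, directed=False))
direction_auc = roc_auc(scores, edge_labels(candidates, truth, directed=True))
```

In this toy case the method finds the related pair but ranks the reversed edge higher, so it does well on existence and worse on direction.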


We also use the precision/recall curve to compare the methods,
as shown in Figure 5. The Global Direction Measure
has the highest curve, followed by the Bridge Ensemble Measure,
in both dimensions. This indicates that for various recall values
our measures predict more accurate edges than the baseline. It is
also interesting to notice that in the precision/recall curve of the
existence evaluation (Figure 5 (a)), the Bridge Ensemble Measure
has the highest precision when the recall is less than 0.1, while
in the precision/recall curve of the direction evaluation (Figure 5
(b)) it has the lowest precision until it reaches the recall value of 0.2.
This indicates that, in the interval [0.0, 0.2], the Bridge Ensemble
Measure captures the existence of the edges but fails at specifying the
correct direction. To examine the reason, we study the performance
of the individual components of the Bridge Ensemble Measure, as depicted
in Figure 6. The undirected sliding windows method
has the highest curve in the existence evaluation (Figure 6 (a));
since it only captures the proximity of pairs of concepts and how
they are related, it lifts the precision/recall curve of the existence
performance in the interval [0.0, 0.2] by capturing correct prerequisite
edges. However, since the temporal feature is used only in a limited
way, as a binary variable among lectures through the bridges and first
lecture indicator components, the measure sometimes fails at predicting the
correct direction of edges between concepts that only appear within
the same lectures. In contrast, the Global Direction Measure exploits
the global direction indicator, which keeps track of the global temporal
order frequency and hence emphasizes or corrects the direction
captured by the directed sliding windows method, as depicted in
Figure 7. Figure 7 shows that the global direction indicator
improves the edge direction of the directed windows method when
the recall value is less than 0.2, while it emphasizes the edge direction
of the directed windows after that.


Figure 6: The comparison between the performance of
the Bridge Ensemble Measure components. While the
undirected sliding windows method correctly captures the edge
existence in the interval [0.0, 0.2], it fails at predicting
edge directions. Panels: (a) Existence Evaluation, (b) Direction Evaluation.


To further analyze the differences between the Bridge Ensemble
Measure and the Global Direction Measure, we examine their behavior
in the existence dimension. We found that all true positive edges and
false positive edges captured by the Global Direction Measure are also
captured by the Bridge Ensemble Measure. However, the Bridge Ensemble
Measure has 59 more false positive edges and only 4 more true
positive edges. We examined the source of the extra
false positive edges in the Bridge Ensemble Measure and found that
73% came from the bridge method, 3% came from the first lecture
indicator, and 22% came from both the bridge method and the first
lecture indicator, while the sliding windows method contributed none
(0%). Further examination of these extra false positive errors shows
that some of them capture long distance dependencies such as the
relation “natural language processing” → “recommender systems”,
which captures the dependency between concepts explained in
the first and last lectures. By examining the source of this relation, we
found that the bridge method infers it. As
mentioned earlier, the bridge method captures the dependency relations
between concepts across lectures and, in contrast to the sliding
windows method, it does not require the proximity of concepts within
lecture transcripts. This property of the bridge method gives the
Bridge Ensemble Measure the ability to capture long distance
relations between concepts, in contrast to the Global Direction Measure,
which only captures the local dependencies between concepts (within
lectures).


Figure 7: The effect of the global direction indicator on
the Global Direction Measure. The GDI improves the edge
direction of the directed windows method when the recall
value is less than 0.2, while it emphasizes the edge direction
of the directed windows after that.


We also conduct a qualitative analysis of the false positive edges to
understand why their number is high and the precision is
therefore low. We found three types of false positive edges that we may
actually consider correct relations. First, the transitive edges
that are captured by our measures are not always specified in the
ground truth edges. For example, students specify the relations
“length normalization” → “ranking function” and “ranking function”
→ “vector space model”. While both our measures and the baseline
capture these relations, they go further and also capture the transitive
relation “length normalization” → “vector space model”. Second,
there are issues with relations with differing concept granularities.
For instance, students specify a dependency relation “language
models” → “dirichlet prior smoothing” while the graphs generated
by the three methods capture the relation “language models” →
“smoothing methods.” The concept “smoothing methods” is more
general than the concept “dirichlet prior smoothing.” Third, there
are missing “true” relations that the students did not specify in the
ground truth. For example, students did not specify the following
relations that are captured by our measures: “tfidf” → “bm25” and
“length normalization” → “bm25.” In general, these three types of false
positive errors can explain to some extent the high number of false
positives and thus the low precision values.
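
The first type of error can be detected mechanically. The sketch below, a hypothetical helper rather than part of the reported analysis, checks whether a predicted edge that is missing from the ground truth is nonetheless implied by a chain of ground truth edges.

```python
def transitively_implied(edge, truth):
    """True if `edge` is absent from `truth` but reachable through truth edges."""
    if edge in truth:
        return False
    start, target = edge
    successors = {}
    for src, dst in truth:
        successors.setdefault(src, set()).add(dst)
    stack, seen = [start], {start}
    while stack:  # depth-first search over ground truth edges
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


truth = {
    ("length normalization", "ranking function"),
    ("ranking function", "vector space model"),
}
predicted = ("length normalization", "vector space model")
implied = transitively_implied(predicted, truth)
```

Counting such edges separately would show how much of the measured false positive rate comes from relations that the students left implicit.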


In general, the Bridge Ensemble Measure and the Global Direc- 
tion Measure outperform the baseline in terms of AUC and preci- 
sion/recall curves, with the Global Direction Measure having the 
overall highest performance. These results emphasize the positive 
effect of the temporal feature on improving the accuracy of the 
generated concept graph. 


5. LIMITATIONS 


There are some limitations in our study. First, in the evaluation we 
have not examined the robustness of our measures compared to 
the baseline utilizing other courses taught by different instructors. 
Second, we use the students’ perspectives of the concept dependency
graph as a ground truth, and we are the first study to do so. However,
in the future we plan to compare various methods’ performance by 
utilizing not only the students’ perspectives of the concept graph but 
also one generated by instructors. Third, in this study, we include an 
edge in the ground truth even if only one student specifies it; in the 
future we plan to use some agreement measures before including an 
edge in the ground truth. Fourth, we represent the course concept 
graph according to the dependency structure without distinguishing 
whether the dependency relation captures the hierarchical structure 
or real prerequisite relationships. We believe that the ideal structure 
of the concept dependency graph is a hierarchical graph with cross 
link edges where the hierarchical structure captures the “general 
concept” to “specific concept” relations while the cross links depict 
the prerequisite relationships between concepts. 


6. RELATED WORK 


Most prior work focuses on relationships between concepts such 
as similarity relations [13] and hierarchical relations [5]. Although 
the most important concept relation to learners is the dependency 
or prerequisite relation, this relation has been the least studied [4]. 
Some prior works utilize Wikipedia articles [6, 12, 1, 7], scien- 
tific corpora [4], or educational materials from online educational 
platforms [14, 2, 7] to model the dependency structure between 
concepts. While many studies utilized external knowledge to recover
the prerequisite relations [14, 7], Chaplot and Koedinger [2] utilize
the course content together with student performance to infer such relations.
In contrast, to make our method more accessible, we exploit only 
the easily accessible educational materials to model the dependency 
relations among course concepts. 


Previous research represents graph concepts in various ways. Gordon 
et al. [4] identify concepts using LDA topic modeling that fails in 
identifying finer-grained concepts. Yang et al. [14] explored four 
different representations and found that word and category represen- 
tations have similar performance; however, word representation has 
slightly better performance on some data sets. One problem with 
using category representations is that mapping phrases to Wikipedia 
categories affects concept granularities by preferring more general 
concepts. On the other hand, Chaplot and Koedinger [2] found that 
noun phrase representation outperforms other representations. There- 
fore, in this study, we utilize noun phrase representation but extend 
it using temporal information. 


Previous work developed supervised [1, 12, 14] and unsupervised 
approaches [6, 7, 2] to predict the dependency relationships among 
concepts. Several studies rely on external knowledge to predict 
prerequisite relations across courses [14, 7] while we only lever- 
age course materials to model the dependency relations within a 
course not between courses. Chaplot and Koedinger [2] address the 
dependency structure within courses, but between units instead of 
concepts taught within units. Another main difference is the use of 
the temporal feature in the course delivery to model the dependency 
structure as we are the first study that exploits the temporal feature. 


7. CONCLUSIONS 


In this paper, we leverage the accessible MOOC content and in- 
corporate the temporal feature of the course to construct a concept 
dependency graph. We developed the Bridge Ensemble Measure and the
Global Direction Measure, which exploit the temporal order in course
delivery to model the dependency structure. Our evaluation shows
that both measures outperform the baseline method
in AUC and in precision/recall curves. This finding emphasizes the
positive effect of utilizing the temporal feature of course progression.